Extended HMM and Ranking Models for Chinese Spelling Correction

Authors

  • Jinhua Xiong
  • Qiao Zhang
  • Jianpeng Hou
  • Qianbo Wang
  • Yuanzhuo Wang
  • Xueqi Cheng
Abstract

Spelling correction has been studied for many decades and can be classified into two categories: (1) regular text spelling correction and (2) query spelling correction. Although the two tasks share many common techniques, they have different concerns. This paper presents our work on the CLP-2014 bake-off. The task focuses on spelling checking of Chinese essays written by foreign learners. Compared to the online search query spelling checking task, more complicated techniques can be applied for better performance. We therefore propose a unified framework for spelling correction of Chinese essays based on extended HMM and ranker-based models, together with a rule-based model for further polishing. Our system showed better performance on the test dataset.

Similar resources

HANSpeller: A Unified Framework for Chinese Spelling Correction

Increased interest in China from foreigners has led to a corresponding interest in the study of Chinese. However, non-native speakers encounter many difficulties in learning Chinese, so Chinese spelling check techniques for Chinese as a Foreign Language (CFL) learners are highly desirable. This paper presents our work on the SIGHAN-2015 Chinese Spelling Check task. The task focuses on sp...

Spelling Correction Based on User Search Contextual Analysis and Domain Knowledge

We propose a spelling correction algorithm that combines trusted domain knowledge and query log information for query spelling correction. This algorithm uses query reformulations in the query log and bigram language models built from queries for efficiently and effectively generating correction suggestions and ranking them to find valid corrections. Experimental results show that for both simp...

CSE 256 (Spring 2004) “Language Models for Spelling Correction”

This project examines the use of language models in a spelling correction system that adopts the “Noisy Channel Model”. Various models based on bigram counts are tested in an experiment where typos are introduced into a test corpus, and corrections are made by language model ranking alone. Simple bigram models perform noticeably better than the unigram model (84% accuracy vs. 74%). And more sop...

Chinese Word Spelling Correction Based on N-gram Ranked Inverted Index List

Spelling correction can help individuals enter text into a machine using written language so that they obtain relevant information efficiently and effectively. In related applications such as web search, writing systems, recommender systems, and document mining, checking for typos before printing is closely related to spelling correction. Individuals can input text, keyword, sentence how to int...

Discriminative Reranking for Spelling Correction

This paper proposes a novel approach to spelling correction. It reranks the output of an existing spelling corrector, Aspell. A discriminative model (Ranking SVM) is employed to improve upon the initial ranking, using additional features as evidence. These features are derived from state-of-the-art techniques in spelling correction, including edit distance, letter-based n-gram, phonetic similari...

Publication date: 2014